fix(cross-repo): emit HTTP_CALLS for unindexed client libs and normalize URLs for route matching (#523) #536
…ize URLs for route matching (DeusData#523)
- pass_calls.c: detect known HTTP/async client patterns (requests, httpx, axios, etc.) by service pattern match even when the target node doesn't exist in the graph (external dep not indexed). Fixes zero-edge output on normal repos where HTTP clients are pip/npm dependencies.
- pass_cross_repo.c: strip scheme+host+port from consumer url_path before QN lookup (cr_url_path). Add path-param template matching so concrete paths (/v2/orders/123) match provider route templates (/v2/orders/{id}). Add reverse-direction match so HTTP_CALLS in the consumer DB are found when cross-repo is run from the provider side.
Signed-off-by: RithvikReddy0-0 <rithvikreddymukkara@gmail.com>
Built this branch on macOS (arm64) and traced resolve_single_call to see exactly what happens. The two halves behave differently:
- pass_cross_repo.c (URL normalize + template match): works. Holding an HTTP_CALLS edge constant (url_path = the full URL http://order-service:8080/v2/orders/123), main links nothing and this branch links it to the templated route /v2/orders/{id}. Clean before/after.
- pass_calls.c (emit without the client lib indexed): doesn't fire for a genuinely external requests. The call reaches resolve_single_call and first_string_arg holds the URL, so that part is fine. But with requests not installed or vendored anywhere indexable, the import map is empty (imp_count=0), so cbm_registry_resolve returns an empty qualified_name and the function returns early.
The moment requests is locally resolvable (a vendored stub or an installed venv), imp_count=1, res_qn becomes '...requests.get', svc=1, and it emits, but that is the resolved path, which main emits on too. So the emit-without-target path seems to help only when the callee resolves to a QN that has no node, not when the external client resolves to nothing.
Was requests pip-installed in your WSL repro? If so, cross_http_calls: 1 is the matcher fix plus a resolvable call, and the index-just-my-service case (the #523 scenario) is still 0. Happy to share the repro.
DeusData#523) The previous emit-without-target path sat after the empty-QN early return, so a genuinely external client (requests/axios not installed or vendored) bailed at the empty-QN return before reaching it. The import map is empty, cbm_registry_resolve returns no QN, and there was nothing for cbm_service_pattern_match to classify. Move the detection into the empty-QN branch and classify from the raw callee name (requests.get -> HTTP, GET) instead of the resolved QN. Verified without any vendored stub: HTTP_CALLS now fires and cross-repo links the call to the provider templated route (cross_http_calls: 1). Signed-off-by: RithvikReddy0-0 <rithvikreddymukkara@gmail.com>
Good catch, you were exactly right. The emit sat after the empty-QN early return, so a genuinely external … Fixed in the latest commit: the detection now lives in the empty-QN branch and classifies from the raw callee name (…
One caveat worth flagging separately: a single-file provider still returns 0, but for an unrelated reason: FastAPI route extraction (… Thanks for tracing this on your end ;)
Re-validated 0a8a44f on macOS (arm64), with a genuinely external requests (no stub, no install, consumer indexes only its own service):
So the empty-QN path is fixed. Confirmed on my end.
On the single-file caveat: I'm not reproducing it here. My provider is a single app.py with two routes (@app.get + @app.post), and both Route nodes are extracted fine, which is why the end-to-end run above links. So the no-route-on-single-file behavior may be platform-specific or a narrower trigger than file count, rather than a general <50-file thing. It didn't block the cross-repo case for me, but I'm happy to share details if you open a separate issue for it.
Nice work turning this around so fast.
Thanks for re-validating ;) You're right to push back on the single-file theory if your single … Appreciate the thorough trace throughout; it made both fixes tighter.
Thanks @RithvikReddy0-0 — the unindexed-client + URL-normalization direction is right. Two things before this can land:
Also, …
…DeusData#523) Addresses review on DeusData#536. insert_cross_edge now skips insertion when an identical (source_id, target_id, type) edge already exists. The pass reaches the same caller/route pair from both directions and emit_cross_route_bidirectional writes both DBs, so without this guard the same CROSS_HTTP_CALLS pair was re-emitted and inflated http_edges. Verified idempotent: repeated runs and runs from either project side both yield cross_http_calls: 1 with exactly one edge per DB. Documented why emit_http_async_edge is called with source_node as both source and target in the unindexed-external-client path. Signed-off-by: RithvikReddy0-0 <rithvikreddymukkara@gmail.com>
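The guard described in this commit can be illustrated with a minimal sketch. It is not the project's insert_cross_edge: the in-memory array stands in for the real edge store, and every name prefixed with sketch_ is hypothetical. It only shows the idea of skipping an insert when an identical (source_id, target_id, type) triple is already present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_EDGES 64

/* Hypothetical edge record; the real schema is not shown in this PR. */
typedef struct {
    long source_id;
    long target_id;
    char type[32];
} sketch_edge;

/* Hypothetical in-memory stand-in for the edge store. */
typedef struct {
    sketch_edge edges[SKETCH_MAX_EDGES];
    size_t count;
} sketch_edge_store;

/* Linear scan for an identical (source_id, target_id, type) triple. */
static bool sketch_edge_exists(const sketch_edge_store *s, long src, long dst,
                               const char *type) {
    for (size_t i = 0; i < s->count; i++) {
        const sketch_edge *e = &s->edges[i];
        if (e->source_id == src && e->target_id == dst &&
            strcmp(e->type, type) == 0) {
            return true;
        }
    }
    return false;
}

/* Returns true if the edge was inserted, false if it was skipped because it
 * already exists, the store is full, or the type name does not fit. */
static bool sketch_insert_edge(sketch_edge_store *s, long src, long dst,
                               const char *type) {
    assert(s != NULL && type != NULL);
    if (sketch_edge_exists(s, src, dst, type)) {
        return false; /* idempotent: the same pair reached again is a no-op */
    }
    if (s->count >= SKETCH_MAX_EDGES ||
        strlen(type) >= sizeof s->edges[0].type) {
        return false;
    }
    sketch_edge *e = &s->edges[s->count++];
    e->source_id = src;
    e->target_id = dst;
    strcpy(e->type, type); /* length checked above */
    return true;
}
```

With a guard of this shape, reaching the same caller/route pair from both directions leaves one edge in the store, while the reversed pair (target and source swapped) still counts as a distinct edge.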
Thanks for the review. Addressed all three in 4817d79.
1. Duplicate edges. Fixed via an idempotency guard in …
One thing worth raising on the "single direction" suggestion: I don't think dropping the reverse …
Tradeoff I want to flag: the guard adds a …
3. Self-pass comment. Added … documents that the external client has no graph node, so …
2. Test. This is the one I'd like a pointer on.
Fixes two root causes of cross-repo-intelligence returning 0 edges (#523).
pass_calls.c
HTTP client calls (requests, httpx, axios, etc.) were silently dropped when
the client library wasn't indexed (external pip/npm dep). The callee resolved
to a QN but cbm_gbuf_find_by_qn returned NULL, so the call was discarded
before HTTP classification.
Fix: detect known HTTP/async patterns via cbm_service_pattern_match and emit
the edge even without a target node in the graph.
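As a rough illustration of classifying a call without a target node, the sketch below maps a raw callee name such as requests.get to an HTTP method. It is not cbm_service_pattern_match, whose signature is not shown in this PR: the function name, the client table, and the verb table are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of client-library prefixes treated as HTTP clients. */
static const char *const k_http_clients[] = {"requests", "httpx", "axios"};

/* Hypothetical helper: given a raw callee such as "requests.get", returns
 * the upper-case HTTP method ("GET") when the prefix is a known client and
 * the suffix is a known verb, or NULL otherwise. */
static const char *sketch_http_method(const char *callee) {
    static const struct {
        const char *verb;
        const char *method;
    } verbs[] = {
        {"get", "GET"},     {"post", "POST"},     {"put", "PUT"},
        {"patch", "PATCH"}, {"delete", "DELETE"},
    };
    assert(callee != NULL);

    const char *dot = strchr(callee, '.');
    if (dot == NULL) {
        return NULL; /* no "client.verb" shape */
    }
    size_t prefix_len = (size_t)(dot - callee);

    int known_client = 0;
    for (size_t i = 0; i < sizeof k_http_clients / sizeof k_http_clients[0]; i++) {
        if (strlen(k_http_clients[i]) == prefix_len &&
            strncmp(callee, k_http_clients[i], prefix_len) == 0) {
            known_client = 1;
            break;
        }
    }
    if (!known_client) {
        return NULL;
    }

    for (size_t i = 0; i < sizeof verbs / sizeof verbs[0]; i++) {
        if (strcmp(dot + 1, verbs[i].verb) == 0) {
            return verbs[i].method;
        }
    }
    return NULL; /* known client, but not a plain HTTP verb call */
}
```

A caller could emit the HTTP_CALLS edge whenever this returns non-NULL and the first string argument holds a URL, regardless of whether the client library has a node in the graph.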
pass_cross_repo.c
Three issues in match_http_routes:
- Consumer url_path carried full URL (scheme+host+port); provider Route has bare path. Added cr_url_path() to strip scheme+authority before QN lookup.
- Concrete paths (/v2/orders/123) never matched templated routes (/v2/orders/{id}). Added cr_path_matches_template() and find_route_handler_fuzzy() for segment-level template matching.
- match_http_routes only searched HTTP_CALLS in the src project. When cross-repo is run from the provider side, HTTP_CALLS live in the consumer DB. Added reverse direction call so both orientations are covered.
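The first two fixes can be sketched roughly as follows. This is not the code of cr_url_path() or cr_path_matches_template(). The sketch_ functions are hypothetical stand-ins that assume a {name} placeholder binds exactly one non-empty path segment and that query strings have already been removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for URL normalization: returns a pointer to the
 * path part of `url`, skipping "scheme://authority" when present. */
static const char *sketch_url_path(const char *url) {
    assert(url != NULL);
    if (url[0] == '/') {
        return url; /* already a bare path */
    }
    const char *sep = strstr(url, "://");
    if (sep == NULL) {
        return url; /* no scheme: leave unchanged */
    }
    const char *slash = strchr(sep + 3, '/');
    return slash != NULL ? slash : "/"; /* authority with no path */
}

/* Hypothetical segment-level matcher: literal segments must be equal, and a
 * "{name}" template segment matches any single non-empty path segment. */
static bool sketch_path_matches_template(const char *path, const char *tmpl) {
    while (*path != '\0' && *tmpl != '\0') {
        if (*path != '/' || *tmpl != '/') {
            return false;
        }
        path++;
        tmpl++;
        size_t plen = strcspn(path, "/"); /* length of current path segment */
        size_t tlen = strcspn(tmpl, "/"); /* length of current template segment */
        bool is_param = tlen >= 2 && tmpl[0] == '{' && tmpl[tlen - 1] == '}';
        if (is_param) {
            if (plen == 0) {
                return false; /* a parameter needs a value */
            }
        } else if (plen != tlen || strncmp(path, tmpl, plen) != 0) {
            return false;
        }
        path += plen;
        tmpl += tlen;
    }
    /* Both must be exhausted, so the segment counts agree. */
    return *path == '\0' && *tmpl == '\0';
}
```

Under these assumptions, the consumer URL http://order-service:8080/v2/orders/123 normalizes to /v2/orders/123, which then matches the provider template /v2/orders/{id}, while /v2/users/123 does not.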
Repro
Verified manually: FastAPI provider + requests consumer, cross-repo-intelligence
now returns cross_http_calls: 1 where it previously returned 0.
Checklist
(git commit -s): required, CI rejects unsigned commits (DCO, see CONTRIBUTING.md)
(make -f Makefile.cbm test)
(make -f Makefile.cbm lint-ci)